
    Visualization of comparative genomic analyses by BLAST score ratio

    BACKGROUND: The first microbial genome sequence, Haemophilus influenzae, was published in 1995. Since then, more than 400 microbial genome sequences have been completed or commenced. This massive influx of data provides the opportunity to obtain biological insights through comparative genomics. However, few tools are available for this scale of comparative analysis. RESULTS: The BLAST Score Ratio (BSR) approach, implemented in a Perl script, classifies all putative peptides within three genomes using a measure of similarity based on the ratio of BLAST scores. The output of the BSR analysis enables global visualization of the degree of proteome similarity between all three genomes. Additional output enables the genomic synteny (conserved gene order) between each genome pair to be assessed. Furthermore, we extend this synteny analysis by overlaying BSR data as a color dimension, enabling visualization of the degree of similarity of the peptides being compared. CONCLUSIONS: Combining the degree of similarity, synteny and annotation will allow rapid identification of conserved genomic regions as well as a number of common genomic rearrangements such as insertions, deletions and inversions. The script and example visualizations are available at:
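The ratio-based classification described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' Perl script. It assumes that each peptide's ratio is its BLAST score against a comparison genome divided by its score against itself, and that a missing hit counts as a score of zero. The function names, the category labels and the 0.4 cut-off are hypothetical choices made only for illustration.

```python
from typing import Dict, Tuple


def blast_score_ratio(query_score: float, self_score: float) -> float:
    """Score against a comparison genome divided by the self-match score."""
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    return query_score / self_score


def bsr_pairs(
    self_scores: Dict[str, float],
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
) -> Dict[str, Tuple[float, float]]:
    """Return {peptide: (BSR vs genome A, BSR vs genome B)}.

    Assumption: a peptide with no hit in a comparison genome scores 0.0.
    """
    return {
        peptide: (
            blast_score_ratio(scores_a.get(peptide, 0.0), self_score),
            blast_score_ratio(scores_b.get(peptide, 0.0), self_score),
        )
        for peptide, self_score in self_scores.items()
    }


def classify(bsr_a: float, bsr_b: float, threshold: float = 0.4) -> str:
    """Label a reference peptide; the threshold is an illustrative assumption."""
    in_a = bsr_a >= threshold
    in_b = bsr_b >= threshold
    if in_a and in_b:
        return "conserved in both"
    if in_a:
        return "shared with A only"
    if in_b:
        return "shared with B only"
    return "unique to reference"


if __name__ == "__main__":
    # Hypothetical scores for two reference peptides.
    pairs = bsr_pairs(
        self_scores={"p1": 200.0, "p2": 100.0},
        scores_a={"p1": 200.0, "p2": 10.0},
        scores_b={"p1": 150.0},
    )
    for peptide, (a, b) in pairs.items():
        print(peptide, a, b, classify(a, b))
```

Plotting each peptide at its pair of ratios would give the kind of three-genome similarity view the abstract mentions, with the ratio also usable as a color value on a synteny plot.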

    Comparative genomic analysis and molecular examination of the diversity of enterotoxigenic Escherichia coli isolates from Chile

    Enterotoxigenic Escherichia coli (ETEC) is one of the most common diarrheal pathogens in the low- and middle-income regions of the world; however, a systematic examination of the genomic content of isolates from Chile has not yet been undertaken. Whole genome sequencing and comparative analysis of a collection of 125 ETEC isolates from three geographic locations in Chile allowed the interrogation of phylogenomic groups, sequence types and genes specific to isolates from the different geographic locations. A total of 80.8% (101/125) of the ETEC isolates were identified in E. coli phylogroup A, 15.2% (19/125) in phylogroup B, and 4.0% (5/125) in phylogroup E. The over-representation of genomes in phylogroup A was significantly different from other global ETEC genomic studies. The Chilean ETEC isolates could be further subdivided into sub-clades similar to previously defined global ETEC reference lineages that had conserved multi-locus sequence types and toxin profiles. Comparison of the gene content of the Chilean ETEC identified genes that were unique based on geographic location within Chile, phylogenomic classifications or sequence type. Completion of a limited number of genomes provided insight into the ETEC plasmid content, which is conserved in some phylogenomic groups and not conserved in others. These findings suggest that the Chilean ETEC isolates contain unique virulence factor combinations and genomic content compared to global reference ETEC isolates.

    The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27-57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be translated into clinical diagnostics, or can be used to identify broadly conserved putative therapeutic candidates.
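As a rough illustration of how a CDS-by-genome matrix of BSR values might be parsed for group-specific markers, the Python sketch below filters an in-memory matrix. It does not use the LS-BSR code or its file format. The function name, the genome and CDS identifiers, and the 0.8 and 0.4 cut-offs are all assumptions introduced for the example.

```python
from typing import Dict, List, Set

# Hypothetical in-memory form of the matrix: CDS id -> genome id -> BSR value.
BsrMatrix = Dict[str, Dict[str, float]]


def group_specific_cds(
    matrix: BsrMatrix,
    group: Set[str],
    present: float = 0.8,  # assumed cut-off for "conserved in the group"
    absent: float = 0.4,   # assumed cut-off for "missing outside the group"
) -> List[str]:
    """Return CDS ids with high BSR in every group genome and low BSR elsewhere."""
    if not group:
        raise ValueError("group must contain at least one genome id")
    markers = []
    for cds, row in matrix.items():
        # A genome missing from the row is treated as having no hit (0.0).
        in_group = all(row.get(genome, 0.0) >= present for genome in group)
        outside = all(
            value < absent for genome, value in row.items() if genome not in group
        )
        if in_group and outside:
            markers.append(cds)
    return sorted(markers)


if __name__ == "__main__":
    # Made-up values for three CDSs across three genomes.
    example: BsrMatrix = {
        "cds1": {"g1": 1.0, "g2": 0.95, "g3": 0.1},
        "cds2": {"g1": 1.0, "g2": 1.0, "g3": 0.9},
        "cds3": {"g1": 0.2, "g2": 0.9, "g3": 0.0},
    }
    print(group_specific_cds(example, {"g1", "g2"}))
```

Swapping the two conditions (low in the group, high elsewhere) would flag candidate gene losses instead of acquisitions.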

    Draft genome sequences of five recent human uropathogenic Escherichia coli isolates

    This study reports the release of draft genome sequences of five isolates of uropathogenic Escherichia coli (UPEC), isolated from patients suffering from uncomplicated cystitis in 2012 in Ann Arbor, Michigan. Phylogenetic analyses revealed that these strains belonged to E. coli phylogroups B2 and D and are closely related to known UPEC strains. Comparative genomic analysis revealed that more conserved proteins were shared between these recent isolates and UPEC strains causing cystitis than those causing pyelonephritis. Additional genomic comparisons identified that three isolates encode a type III secretion system (T3SS) and a putative T3SS effector gene cluster along with an invasin-like outer membrane protein. The presence of T3SS genes is a rare occurrence among UPEC strains. These genomes further substantiate the heterogeneity of the gene pool of UPEC and provide a foundation for comparative genomic studies using recent clinical isolates. This publication briefly describes the draft genomes of five recent human uropathogenic (UPEC) Escherichia coli isolates. UPEC are of increasing importance to human health. The genomes of these new isolates are clearly and simply described and will be of great utility and interest to this research community. Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/136326/1/fim12059.pd

    Conservation and immunogenicity of novel antigens in diverse isolates of enterotoxigenic Escherichia coli

    BACKGROUND: Enterotoxigenic Escherichia coli (ETEC) are common causes of diarrheal morbidity and mortality in developing countries for which there is currently no vaccine. Heterogeneity in classical ETEC antigens known as colonization factors (CFs) and poor efficacy of toxoid-based approaches to date have impeded development of a broadly protective ETEC vaccine, prompting searches for novel molecular targets. METHODOLOGY: Using a variety of molecular methods, we examined a large collection of ETEC isolates for production of two secreted plasmid-encoded pathotype-specific antigens, the EtpA extracellular adhesin, and EatA, a mucin-degrading serine protease; and two chromosomally-encoded molecules, the YghJ metalloprotease and the EaeH adhesin, that are not specific to the ETEC pathovar, but which have been implicated in ETEC pathogenesis. ELISA assays were also performed on control and convalescent sera to characterize the immune response to these antigens. Finally, mice were immunized with recombinant EtpA (rEtpA), and a protease-deficient version of the secreted EatA passenger domain (rEatApH134R) to examine the feasibility of combining these molecules in a subunit vaccine approach. PRINCIPAL FINDINGS: EtpA and EatA were secreted by more than half of all ETEC, distributed over diverse phylogenetic lineages belonging to multiple CF groups, and exhibited surprisingly little sequence variation. Both chromosomally-encoded molecules were also identified in a wide variety of ETEC strains and YghJ was secreted by 89% of isolates. Antibodies against both the ETEC pathovar-specific and conserved E. coli antigens were present in significantly higher titers in convalescent samples from subjects with ETEC infection than controls, suggesting that each of these antigens is produced and recognized during infection. Finally, co-immunization of mice with rEtpA and rEatApH134R offered significant protection against ETEC infection. CONCLUSIONS: Collectively, these data suggest that novel antigens could significantly complement current approaches and foster improved strategies for development of broadly protective ETEC vaccines.

    Enterotoxigenic Escherichia coli secretes a highly conserved mucin-degrading metalloprotease to effectively engage intestinal epithelial cells

    Enterotoxigenic Escherichia coli (ETEC) is a leading cause of death due to diarrheal illness among young children in developing countries, and there is currently no effective vaccine. Many elements of ETEC pathogenesis are still poorly defined. Here we demonstrate that YghJ, a secreted ETEC antigen identified in immunoproteomic studies using convalescent patient sera, is required for efficient access to small intestinal enterocytes and for the optimal delivery of heat-labile toxin (LT). Furthermore, YghJ is a highly conserved metalloprotease that influences intestinal colonization of ETEC by degrading the major mucins in the small intestine, MUC2 and MUC3. Genes encoding YghJ and its cognate type II secretion system (T2SS), which also secretes LT, are highly conserved in ETEC and exist in other enteric pathogens, including other diarrheagenic E. coli and Vibrio cholerae bacteria, suggesting that this mucin-degrading enzyme may represent a shared virulence feature of these important pathogens.